A Step-wise Usage-based Method for Inducing Polysemy-aware Verb Classes

Authors

  • Daisuke Kawahara
  • Daniel Peterson
  • Martha Palmer
Abstract

We present an unsupervised method for inducing verb classes from verb uses in gigaword corpora. Our method consists of two clustering steps: verb-specific semantic frames are first induced by clustering verb uses in a corpus, and verb classes are then induced by clustering these frames. This step-wise approach not only generates verb classes from a massive number of verb uses in a scalable manner, but also handles verb polysemy, which most previous studies on verb clustering bypass. In our experiments, we acquire semantic frames and verb classes from two gigaword corpora, the larger comprising 20 billion words. The effectiveness of our approach is verified through quantitative evaluations based on polysemy-aware gold-standard data.
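
The two-step structure described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the greedy cosine-threshold clustering is a simple stand-in chosen for brevity (the abstract does not specify the clustering model), and the feature names, the threshold of 0.6, and the example verb uses are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (dict-like)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_cluster(vectors, threshold):
    """Stand-in clustering (an assumption, not the paper's model).

    Each vector joins the first cluster whose summed-count centroid is
    at least `threshold` similar; otherwise it starts a new cluster.
    Returns a list of (centroid, member_indices) pairs.
    """
    clusters = []
    for i, vec in enumerate(vectors):
        for centroid, members in clusters:
            if cosine(vec, centroid) >= threshold:
                centroid.update(vec)  # add this vector's counts to the centroid
                members.append(i)
                break
        else:
            clusters.append((Counter(vec), [i]))
    return clusters


def induce_frames(uses, threshold=0.6):
    """Step 1: cluster the uses of each verb separately into frames."""
    by_verb = {}
    for verb, features in uses:
        by_verb.setdefault(verb, []).append(features)
    frames = []
    for verb, vectors in by_verb.items():
        for k, (centroid, _) in enumerate(greedy_cluster(vectors, threshold)):
            frames.append((f"{verb}#{k}", centroid))
    return frames


def induce_classes(frames, threshold=0.6):
    """Step 2: cluster the frames of all verbs together into verb classes."""
    clusters = greedy_cluster([centroid for _, centroid in frames], threshold)
    return [[frames[i][0] for i in members] for _, members in clusters]


# Hypothetical verb uses: (verb, {slot:filler feature -> count}).
USES = [
    ("run", {"subj:person": 1, "obj:mile": 1}),
    ("run", {"subj:person": 1, "obj:mile": 1, "pp_in:park": 1}),
    ("run", {"subj:person": 1, "obj:company": 1}),
    ("jog", {"subj:person": 1, "obj:mile": 1}),
    ("jog", {"subj:person": 1, "obj:mile": 1, "pp_in:park": 1}),
    ("manage", {"subj:person": 1, "obj:company": 1}),
    ("manage", {"subj:person": 1, "obj:company": 1, "pp_for:year": 1}),
]

frames = induce_frames(USES)
classes = induce_classes(frames)
print(classes)
```

In this toy data, "run" yields two frames that land in different classes (one with "jog", one with "manage"), which is the sense in which clustering frames rather than verbs can accommodate polysemy.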

Similar Resources

Clustering Polysemic Subcategorization Frame Distributions Semantically

Previous research has demonstrated the utility of clustering in inducing semantic verb classes from undisambiguated corpus data. We describe a new approach which involves clustering subcategorization frame (SCF) distributions using the Information Bottleneck and nearest neighbour methods. In contrast to previous work, we particularly focus on clustering polysemic verbs. A novel evaluation schem...

Semi-automatic Induction of Systematic Polysemy from WordNet

This paper describes a semi-automatic method of inducing underspecified semantic classes from WordNet verbs and nouns. An underspecified semantic class is an abstract semantic class which encodes systematic polysemy, a set of word senses that are related in systematic and predictable ways. We show the usefulness of the induced classes in the semantic interpretations and contextual inferences o...

Verb polysemy and frequency effects in thematic fit modeling

While several data sets for evaluating thematic fit of verb-role-filler triples exist, they do not control for verb polysemy. Thus, it is unclear how verb polysemy affects human ratings of thematic fit and how best to model that. We present a new dataset of human ratings on high vs. low-polysemy verbs matched for verb frequency, together with high vs. low-frequency and well-fitting vs. poorly-f...

A principled Cognitive Linguistics account of English phrasal verbs with up and out

Many attempts have been made to discover some systematicity in the semantics of phrasal verbs. However, most research has investigated the semantics of particles exclusively; no study has examined how the multiple meanings of the verb also contribute to the meanings of phrasal verbs. The current corpus-based (COCA) study advances the research on phrasal verbs by examining the interaction of the...

Synergetic Properties of Chinese Verb Valency

This paper analyses the 500 most frequent verbs in contemporary Chinese and investigates their synergetic properties. The results show that the rank-frequency distributions of both valency and polysemy abide by a power-law distribution and that valency and polysemy of these verbs abide by the Good distribution and the positive negative binomial distribution respectively. Statistical analysis in...

Publication date: 2014